Federation of Bosnia and Herzegovina
2025 AAAI / ACM SIGAI Doctoral Consortium interviews compilation
Authors pictured in order of their interview publication date (left to right, top to bottom). Each year, a small group of PhD students are chosen to participate in the AAAI/SIGAI Doctoral Consortium . This initiative provides an opportunity for the students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. During 2025, we met with some of the students to find out more about their research and the doctoral consortium experience. Kunpeng Xu completed his PhD at the Université de Sherbrooke and is now a postdoctoral fellow at McGill University.
Higher-arity PAC learning, VC dimension and packing lemma
Chernikov, Artem, Towsner, Henry
The aim of this note is to overview some of our work in Chernikov, Towsner'20 (arXiv:2010.00726) developing higher arity VC theory (VC$_n$ dimension), including a generalization of Haussler packing lemma, and an associated tame (slice-wise) hypergraph regularity lemma; and to demonstrate that it characterizes higher arity PAC learning (PAC$_n$ learning) in $n$-fold product spaces with respect to product measures introduced by Kobayashi, Kuriyama and Takeuchi'15. We also point out how some of the recent results in arXiv:2402.14294, arXiv:2505.15688, arXiv:2509.20404 follow from our work in arXiv:2010.00726.
Interview with Amina Mević: Machine learning applied to semiconductor manufacturing
In a series of interviews, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. In this latest interview, we hear from Amina Mević who is applying machine learning to semiconductor manufacturing. Find out more about her PhD research so far, what makes this field so interesting, and how she found the AAAI Doctoral Consortium experience. I am currently pursuing my PhD at the University of Sarajevo, Faculty of Electrical Engineering, Department of Computer Science and Informatics. My research is being carried out in collaboration with Infineon Technologies Austria as part of the Important Project of Common European Interest (IPCEI) in Microelectronics.
Deep Learning Models for Physical Layer Communications
The increased availability of data and computing resources has enabled researchers to successfully adopt machine learning (ML) techniques and make significant contributions in several engineering areas. ML and in particular deep learning (DL) algorithms have shown to perform better in tasks where a physical bottom-up description of the phenomenon is lacking and/or is mathematically intractable. Indeed, they take advantage of the observations of natural phenomena to automatically acquire knowledge and learn internal relations. Despite the historical model-based mindset, communications engineering recently started shifting the focus towards top-down data-driven learning models, especially in domains such as channel modeling and physical layer design, where in most of the cases no general optimal strategies are known. In this thesis, we aim at solving some fundamental open challenges in physical layer communications exploiting new DL paradigms. In particular, we mathematically formulate, under ML terms, classic problems such as channel capacity and optimal coding-decoding schemes, for any arbitrary communication medium. We design and develop the architecture, algorithm and code necessary to train the equivalent DL model, and finally, we propose novel solutions to long-standing problems in the field.
Diffusion Instruction Tuning
Jin, Chen, Tanno, Ryutaro, Saseendran, Amrutha, Diethe, Tom, Teare, Philip
We introduce Lavender, a simple supervised fine-tuning (SFT) method that boosts the performance of advanced vision-language models (VLMs) by leveraging state-of-the-art image generation models such as Stable Diffusion. Specifically, Lavender aligns the text-vision attention in the VLM transformer with the equivalent used by Stable Diffusion during SFT, instead of adapting separate encoders. This alignment enriches the model's visual understanding and significantly boosts performance across in- and out-of-distribution tasks. Lavender requires just 0.13 million training examples, 2.5% of typical large-scale SFT datasets, and fine-tunes on standard hardware (8 GPUs) in a single day. It consistently improves state-of-the-art open-source multimodal LLMs (e.g., Llama-3.2-11B, MiniCPM-Llama3-v2.5), achieving up to 30% gains and a 68% boost on challenging out-of-distribution medical QA tasks. By efficiently transferring the visual expertise of image generators with minimal supervision, Lavender offers a scalable solution for more accurate vision-language systems. All code, training data, and models will be shared at https://astrazeneca.github.io/vlm/.
Evaluating Large Language Models Against Human Annotators in Latent Content Analysis: Sentiment, Political Leaning, Emotional Intensity, and Sarcasm
Bojic, Ljubisa, Zagovora, Olga, Zelenkauskaite, Asta, Vukovic, Vuk, Cabarkapa, Milan, Jerkovic, Selma Veseljević, Jovančevic, Ana
In the era of rapid digital communication, vast amounts of textual data are generated daily, demanding efficient methods for latent content analysis to extract meaningful insights. Large Language Models (LLMs) offer potential for automating this process, yet comprehensive assessments comparing their performance to human annotators across multiple dimensions are lacking. This study evaluates the reliability, consistency, and quality of seven state-of-the-art LLMs, including variants of OpenAI's GPT-4, Gemini, Llama, and Mixtral, relative to human annotators in analyzing sentiment, political leaning, emotional intensity, and sarcasm detection. A total of 33 human annotators and eight LLM variants assessed 100 curated textual items, generating 3,300 human and 19,200 LLM annotations, with LLMs evaluated across three time points to examine temporal consistency. Inter-rater reliability was measured using Krippendorff's alpha, and intra-class correlation coefficients assessed consistency over time. The results reveal that both humans and LLMs exhibit high reliability in sentiment analysis and political leaning assessments, with LLMs demonstrating higher internal consistency than humans. In emotional intensity, LLMs displayed higher agreement compared to humans, though humans rated emotional intensity significantly higher. Both groups struggled with sarcasm detection, evidenced by low agreement. LLMs showed excellent temporal consistency across all dimensions, indicating stable performance over time. This research concludes that LLMs, especially GPT-4, can effectively replicate human analysis in sentiment and political leaning, although human expertise remains essential for emotional intensity interpretation. The findings demonstrate the potential of LLMs for consistent and high-quality performance in certain areas of latent content analysis.